AITopics | training mode

Collaborating Authors

training mode

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

RecommendationModels

Neural Information Processing SystemsFeb-11-2026, 16:32:29 GMT

Although synchronous AR training is designed to have higher training efficiency,asynchronous PStraining would beabetter choice for training speed when there are stragglers (slow workers) in the shared cluster, especially under limited computing resources.

artificial intelligence, machine learning, staleness, (17 more...)

Neural Information Processing Systems

Country:

Europe > Czechia > Prague (0.05)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

An Investigation of Batch Normalization in Off-Policy Actor-Critic Algorithms

Wang, Li, Sudun, null, Zhang, Xingjian, Wu, Wenjun, Huang, Lei

arXiv.org Artificial IntelligenceSep-30-2025

Batch Normalization (BN) has played a pivotal role in the success of deep learning by improving training stability, mitigating overfitting, and enabling more effective optimization. However, its adoption in deep reinforcement learning (DRL) has been limited due to the inherent non-i.i.d. In this paper, we argue that, despite these challenges, BN retains unique advantages in DRL settings, particularly through its stochasticity and its ability to ease training. When applied appropriately, BN can adapt to evolving data distributions and enhance both convergence speed and final performance. To this end, we conduct a comprehensive empirical study on the use of BN in off-policy actor-critic algorithms, systematically analyzing how different training and evaluation modes impact performance. We further identify failure modes that lead to instability or divergence, analyze their underlying causes, and propose the Mode-A ware Batch Normalization (MA-BN) method with practical actionable recommendations for robust BN integration in DRL pipelines. We also empirically validate that, in RL settings, MA-BN accelerates and stabilizes training, broadens the effective learning rate range, enhances exploration, and reduces overall optimization difficulty. Batch Normalization (BN) (Ioffe & Szegedy, 2015) has been a foundational technique in deep learning, playing a critical role in improving training stability(Santurkar et al., 2018), introducing stochasticity (Shekhovtsov & Flach, 2018; Huang et al., 2020), and enabling domain adaptation(Wang et al., 2020; Schneider et al., 2020). It has become a milestone in the development of deep neural networks due to its effectiveness in mitigating overfitting.

artificial intelligence, machine learning, statistics, (14 more...)

arXiv.org Artificial Intelligence

2509.2375

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)

Add feedback

From Research to Reality: Feasibility of Gradient Inversion Attacks in Federated Learning

Valadi, Viktor, Åkesson, Mattias, Östman, Johan, Toor, Salman, Hellander, Andreas

arXiv.org Artificial IntelligenceAug-28-2025

Gradient inversion attacks have garnered attention for their ability to compromise privacy in federated learning. However, many studies consider attacks with the model in inference mode, where training-time behaviors like dropout are disabled and batch normalization relies on fixed statistics. In this work, we systematically analyze how architecture and training behavior affect vulnerability, including the first in-depth study of inference-mode clients, which we show dramatically simplifies inversion. To assess attack feasibility under more realistic conditions, we turn to clients operating in standard training mode. In this setting, we find that successful attacks are only possible when several architectural conditions are met simultaneously: models must be shallow and wide, use skip connections, and, critically, employ pre-activation normalization. We introduce two novel attacks against models in training-mode with varying attacker knowledge, achieving state-of-the-art performance under realistic training conditions. We extend these efforts by presenting the first attack on a production-grade object-detection model. Here, to enable any visibly identifiable leakage, we revert to the lenient inference mode setting and make multiple architectural modifications to increase model vulnerability, with the extent of required changes highlighting the strong inherent robustness of such architectures. We conclude this work by offering the first comprehensive mapping of settings, clarifying which combinations of architectural choices and operational modes meaningfully impact privacy. Our analysis provides actionable insight into when models are likely vulnerable, when they appear robust, and where subtle leakage may persist. Together, these findings reframe how gradient inversion risk should be assessed in future research and deployment scenarios.

artificial intelligence, machine learning, statistics, (13 more...)

arXiv.org Artificial Intelligence

2508.19819

Country: Europe > Sweden (0.69)

Genre: Research Report > New Finding (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

GBA: AT uning-free Approach to Switch between Synchronous and Asynchronous Training for Recommendation Models

Neural Information Processing SystemsAug-18-2025, 10:59:59 GMT

Although synchronous AR training is designed to have higher training efficiency, asynchronous PS training would be a better choice for training speed when there are stragglers (slow workers) in the shared cluster, especially under limited computing resources.

artificial intelligence, machine learning, training mode, (17 more...)

Neural Information Processing Systems

Country:

Europe > Czechia > Prague (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

ProFuser: Progressive Fusion of Large Language Models

Shi, Tianyuan, Wan, Fanqi, Huang, Canbin, Quan, Xiaojun, Li, Chenliang, Yan, Ming, Zhang, Ji

arXiv.org Artificial IntelligenceAug-9-2024

While fusing the capacities and advantages of various large language models (LLMs) offers a pathway to construct more powerful and versatile models, a fundamental challenge is to properly select advantageous model during the training. Existing fusion methods primarily focus on the training mode that uses cross entropy on ground truth in a teacher-forcing setup to measure a model's advantage, which may provide limited insight towards model advantage. In this paper, we introduce a novel approach that enhances the fusion process by incorporating both the training and inference modes. Our method evaluates model advantage not only through cross entropy during training but also by considering inference outputs, providing a more comprehensive assessment. To combine the two modes effectively, we introduce ProFuser to progressively transition from inference mode to training mode. To validate ProFuser's effectiveness, we fused three models, including vicuna-7b-v1.5, Llama-2-7b-chat, and mpt-7b-8k-chat, and demonstrated the improved performance in knowledge, reasoning, and safety compared to baseline methods.

inference mode, source model, training mode, (17 more...)

arXiv.org Artificial Intelligence

2408.04998

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Add feedback

Maze Discovery using Multiple Robots via Federated Learning

Ranasinghe, Kalpana, Madushanka, H. P., Scaciota, Rafaela, Samarakoon, Sumudu, Bennis, Mehdi

arXiv.org Artificial IntelligenceJun-25-2024

This work presents a use case of federated learning (FL) applied to discovering a maze with LiDAR sensors-equipped robots. Goal here is to train classification models to accurately identify the shapes of grid areas within two different square mazes made up with irregular shaped walls. Due to the use of different shapes for the walls, a classification model trained in one maze that captures its structure does not generalize for the other. This issue is resolved by adopting FL framework between the robots that explore only one maze so that the collective knowledge allows them to operate accurately in the unseen maze. This illustrates the effectiveness of FL in real-world applications in terms of enhancing classification accuracy and robustness in maze discovery tasks.

classification model, maze, robot, (12 more...)

arXiv.org Artificial Intelligence

2407.01596

Country: Europe > Finland > Northern Ostrobothnia > Oulu (0.06)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.41)

Add feedback

An Off-Policy Reinforcement Learning Algorithm Customized for Multi-Task Fusion in Large-Scale Recommender Systems

Liu, Peng, Xu, Cong, Zhao, Ming, Zhu, Jiawei, Wang, Bin, Ren, Yi

arXiv.org Artificial IntelligenceMay-6-2024

As the last critical stage of RSs, Multi-Task Fusion (MTF) is responsible for combining multiple scores outputted by Multi-Task Learning (MTL) into a final score to maximize user satisfaction, which determines the ultimate recommendation results. Recently, to optimize long-term user satisfaction within a recommendation session, Reinforcement Learning (RL) is used for MTF in the industry. However, the off-policy RL algorithms used for MTF so far have the following severe problems: 1) to avoid out-of-distribution (OOD) problem, their constraints are overly strict, which seriously damage their performance; 2) they are unaware of the exploration policy used for producing training data and never interact with real environment, so only suboptimal policy can be learned; 3) the traditional exploration policies are inefficient and hurt user experience. To solve the above problems, we propose a novel method named IntegratedRL-MTF customized for MTF in large-scale RSs. IntegratedRL-MTF integrates off-policy RL model with our online exploration policy to relax overstrict and complicated constraints, which significantly improves its performance. We also design an extremely efficient exploration policy, which eliminates low-value exploration space and focuses on exploring potential high-value state-action pairs. Moreover, we adopt progressive training mode to further enhance our model's performance with the help of our exploration policy. We conduct extensive offline and online experiments in the short video channel of Tencent News. The results demonstrate that our model outperforms other models remarkably. IntegratedRL-MTF has been fully deployed in our RS and other large-scale RSs in Tencent, which have achieved significant improvements.

exploration policy, integratedrl-mtf, recommender system, (16 more...)

arXiv.org Artificial Intelligence

2404.17589

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
South America > Brazil (0.04)
(9 more...)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

ACQ: Improving Generative Data-free Quantization Via Attention Correction

Li, Jixing, Guo, Xiaozhou, Dai, Benzhe, Gong, Guoliang, Jin, Min, Chen, Gang, Mao, Wenyu, Lu, Huaxiang

arXiv.org Artificial IntelligenceJul-29-2023

Data-free quantization aims to achieve model quantization without accessing any authentic sample. It is significant in an application-oriented context involving data privacy. Converting noise vectors into synthetic samples through a generator is a popular data-free quantization method, which is called generative data-free quantization. However, there is a difference in attention between synthetic samples and authentic samples. This is always ignored and restricts the quantization performance. First, since synthetic samples of the same class are prone to have homogenous attention, the quantized network can only learn limited modes of attention. Second, synthetic samples in eval mode and training mode exhibit different attention. Hence, the batch-normalization statistics matching tends to be inaccurate. ACQ is proposed in this paper to fix the attention of synthetic samples. An attention center position-condition generator is established regarding the homogenization of intra-class attention. Restricted by the attention center matching loss, the attention center position is treated as the generator's condition input to guide synthetic samples in obtaining diverse attention. Moreover, we design adversarial loss of paired synthetic samples under the same condition to prevent the generator from paying overmuch attention to the condition, which may result in mode collapse. To improve the attention similarity of synthetic samples in different network modes, we introduce a consistency penalty to guarantee accurate BN statistics matching. The experimental results demonstrate that ACQ effectively improves the attention problems of synthetic samples. Under various training settings, ACQ achieves the best quantization performance. For the 4-bit quantization of Resnet18 and Resnet50, ACQ reaches 67.55% and 72.23% accuracy, respectively.

artificial intelligence, machine learning, synthetic sample, (18 more...)

arXiv.org Artificial Intelligence

2301.07266

Country: Asia > China > Beijing > Beijing (0.05)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Security & Privacy (0.68)

Add feedback

Deploy Amazon SageMaker Autopilot models to serverless inference endpoints

#artificialintelligenceDec-8-2022, 17:27:00 GMT

Amazon SageMaker Autopilot automatically builds, trains, and tunes the best machine learning (ML) models based on your data, while allowing you to maintain full control and visibility. Autopilot can also deploy trained models to real-time inference endpoints automatically. If you have workloads with spiky or unpredictable traffic patterns that can tolerate cold starts, then deploying the model to a serverless inference endpoint would be more cost efficient. Amazon SageMaker Serverless Inference is a purpose-built inference option ideal for workloads with unpredictable traffic patterns and that can tolerate cold starts. Unlike a real-time inference endpoint, which is backed by a long-running compute instance, serverless endpoints provision resources on demand with built-in auto scaling.

container, endpoint, inference container, (12 more...)

#artificialintelligence

Genre: Press Release (0.31)

Industry: Retail > Online (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Curriculum-based Asymmetric Multi-task Reinforcement Learning

Huang, Hanchi, Ye, Deheng, Shen, Li, Liu, Wei

arXiv.org Artificial IntelligenceNov-7-2022

We introduce CAMRL, the first curriculum-based asymmetric multi-task learning (AMTL) algorithm for dealing with multiple reinforcement learning (RL) tasks altogether. To mitigate the negative influence of customizing the one-off training order in curriculum-based AMTL, CAMRL switches its training mode between parallel single-task RL and asymmetric multi-task RL (MTRL), according to an indicator regarding the training time, the overall performance, and the performance gap among tasks. To leverage the multi-sourced prior knowledge flexibly and to reduce negative transfer in AMTL, we customize a composite loss with multiple differentiable ranking functions and optimize the loss through alternating optimization and the Frank-Wolfe algorithm. The uncertainty-based automatic adjustment of hyper-parameters is also applied to eliminate the need of laborious hyper-parameter analysis during optimization. By optimizing the composite loss, CAMRL predicts the next training task and continuously revisits the transfer matrix and network weights. We have conducted experiments on a wide range of benchmarks in multi-task RL, covering Gym-minigrid, Meta-world, Atari video games, vision-based PyBullet tasks, and RLBench, to show the improvements of CAMRL over the corresponding single-task RL algorithm and state-of-the-art MTRL algorithms. The code is available at: https://github.com/huanghanchi/CAMRL

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2211.03352

Country:

Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > Singapore (0.04)
North America > United States (0.04)
(4 more...)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback